[UmbracoExamine] (InternalIndexer)Error indexing queue items,read past EOF
We have a site with a large number of nodes and a high volume of content updates made through a custom approval process. In this process, multiple users of one type can approve nodes created by users of a different type.
We are experiencing the following issue with the standard Umbraco indexes as well as all of our custom indexes. The following example is from the InternalIndexSet:
2013-10-17 14:36:19,597 [5] INFO umbraco.BusinessLogic.Log - [Thread 85] Redirected log call (please use Umbraco.Core.Logging.LogHelper instead of umbraco.BusinessLogic.Log) | Type: Error | User: 0 | NodeId: -1 | Comment: [UmbracoExamine] (InternalIndexer)Error indexing queue items,read past EOF, IndexSet: InternalIndexSet
We can't seem to find the root cause. We've forced the indexes to rebuild multiple times by deleting all of the index files, but the issue keeps cropping up, usually only a day after a forced rebuild.
I found that we had 0 KB segments_* files, and that deleting these allows Umbraco and Luke to read the indexes again. From what I understand, these can be left behind by abandoned processes that empty the segments files but never delete them.
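For anyone who wants to check for the same symptom, here is a minimal sketch of how the empty segments files could be located. It only lists them and does not delete anything. The function name find_empty_segments and the idea of scanning one root folder recursively are assumptions for illustration, not something Umbraco or Examine provides.

```python
from pathlib import Path
from typing import List


def find_empty_segments(index_root: str) -> List[Path]:
    """Return every zero-byte segments_* file found under index_root.

    index_root is a hypothetical parameter: the folder that contains
    the Examine/Lucene index folders for the site.
    """
    root = Path(index_root)
    return sorted(
        p for p in root.rglob("segments_*")
        if p.is_file() and p.stat().st_size == 0
    )
```

Point it at whatever folder holds your index sets (on our kind of setup that would be somewhere under App_Data, but treat the exact path as an assumption and check your own configuration) and review the returned paths before removing anything by hand.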
Is there any workaround for this? Has anyone else run into this same issue? Is it possible this is happening because multiple users could be saving or updating different content at the same time?
Thanks, Robert
Robert,
What version of umbraco are you using?
Regards
Ismail
Hello Ismail,
This particular site is using 6.0.5. Sorry for not including this in my first post.
-Robert
We are still having this problem. To keep the parts of our site that use indexes working, we have been removing the 0 KB segments files manually. This works only for a short time (about 3-4 hours), until some event that triggers the index to optimize brings the problem back. This is not a viable permanent solution, just a temporary one to keep critical parts of the site functioning (location searching, content approval, etc.).
We were able to save some content and watch the index optimize into a new segments file and delete the original segments file without issue. So this does not happen every time the index optimizes itself, but we can't seem to find a pattern related to a specific user or content node that triggers it.
My suspicion is still that the old segments files are being locked and not deleted. Has this behavior been seen in this version of Umbraco before?
Thanks, Robert
We are pretty sure we have found the cause, although we are not able to fully explain the behavior. We had an hourly shadow copy backup enabled on the location where we were hosting the site, which we enabled at the same time this issue started appearing.
The backup ran at 12 minutes past the hour, every hour. During periods of significant content modification, that is when the error I posted above would be logged and an empty segments file would be left behind in some (not always all) of the index folders, most notably the InternalIndex. During periods of high traffic and content modification this would happen every hour, but during slower periods it had gone up to 4 days without any issue.
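One way to check for this kind of clustering in your own logs is to tally the "read past EOF" errors by minute of the hour. The following is only a rough sketch: it assumes log lines start with a timestamp in the same format as the entry quoted earlier in this thread (for example 2013-10-17 14:36:19,597), and the function name eof_errors_by_minute is made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed format: lines begin with "YYYY-MM-DD HH:MM:SS", as in the
# log entry quoted in this thread.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:(\d{2}):\d{2}")


def eof_errors_by_minute(lines: Iterable[str]) -> Counter:
    """Count 'read past EOF' log lines per minute-of-hour (0-59)."""
    counts: Counter = Counter()
    for line in lines:
        if "read past EOF" not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:  # skip lines without a leading timestamp
            counts[int(match.group(1))] += 1
    return counts
```

If most of the errors land on the same minute as a scheduled job (12 past the hour in our case), that job is worth a closer look. A tally like this only shows correlation, not the cause.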
From my research into how Lucene works, during index optimization the older, empty index file can be locked against deletion because another process is accessing or attempting to write to it.
All signs point to the hourly backup being the culprit, but what I don't understand is how it would affect the Lucene index optimization and files, as the admin who configured the site said that shadow copy is not supposed to lock files.
We've changed the backup to run only during non-peak hours, which is working for now, but I'd still like to understand why it would cause these issues.
We've determined that this is definitely being caused by our Rackspace shadow copy. Even with the backup running in off hours, we've seen the empty segments issue happen once in the last 3 days. Any help, input, or ideas would be greatly appreciated.
We reconfigured our Rackspace backup settings and have not seen this issue in over two weeks. I wanted to post the solution here in case anyone else runs into this.
We performed the following steps:
1. Excluded the database data and log file directories from the file backup.
2. Created a SQL job to make a .bak file for the SQL backup.
3. Changed the SQL backup job to run before the file backup, and set both to run during non-peak hours.
Thanks, Robert